Household Enterprise Entry as a Coping Mechanism Under Agricultural Price Shocks: An Agent-Based Model Calibrated to LSMS-ISA Panel Data

Author: Michael Bean
Affiliation: Independent Researcher
Published: January 14, 2026

Abstract

This paper presents an agent-based model (ABM) investigating household enterprise entry as a coping mechanism in response to agricultural price shocks in Sub-Saharan Africa. Using household panel data from the Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA) for Tanzania and Ethiopia, we calibrate distributional parameters for assets, price exposure, and credit access. The model implements household agents that make enterprise participation decisions based on rule-based policies, with an LLM-augmented decision architecture designed for future evaluation. Through parameter sweeps and behavior exploration across calibrated synthetic households, we examine the sensitivity of enterprise dynamics to policy thresholds. Our findings demonstrate that the ABM reproduces key patterns observed in the empirical data, including enterprise prevalence trends and path-dependent household trajectories. We discuss the model’s limitations, including the absence of agent-agent interactions and the exploratory nature of the current computational analysis, and outline directions for more rigorous validation.

1 Introduction

Agricultural price volatility poses significant risks to rural households in developing economies. When cash crop prices decline, households face reduced income and may adopt various coping strategies to smooth consumption (Dercon 2002). One such strategy is enterprise entry—the initiation of non-farm business activities as an alternative income source. Understanding the dynamics of enterprise entry under price shocks has important implications for rural development policy and poverty reduction strategies.

This paper develops an agent-based model (ABM) to investigate the relationship between agricultural price shocks and household enterprise participation in Sub-Saharan Africa. The model is calibrated to household panel data from the Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA) for Tanzania (2008-2014) and Ethiopia (2011-2015) (World Bank 2024).

Research Question: Do negative agricultural price shocks induce enterprise entry as a coping mechanism, and how do household characteristics (assets, credit access) mediate this relationship?

Contributions: This work makes three contributions to the literature:

  1. Calibrated Microsimulation: We develop a generative microsimulation approach where distributional parameters (assets, shocks, credit) are fitted from empirical data, enabling synthetic panel generation that preserves key statistical properties.

  2. Pattern-Oriented Validation: Following Pattern-Oriented Modeling principles (Grimm et al. 2010; Railsback and Grimm 2019), we validate the model against multiple empirical patterns rather than single metrics.

  3. LLM-Augmented Policy Architecture: We design (but do not yet execute) an architecture for LLM-based household decision-making, contributing to the emerging literature on AI-augmented social simulation.

Scope and Limitations: The current model does not include direct agent-agent interactions; households respond independently to shared exogenous price shocks. Aggregate patterns emerge from heterogeneous individual responses, not from complex adaptive dynamics. This design choice reflects the empirical setting where household enterprise decisions are primarily driven by household-level factors rather than social network effects.

3 Data

3.1 LSMS-ISA Panel Data

The model is calibrated using the Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA) harmonized panel data (World Bank 2024). We use:

  • Tanzania: 4 waves (2008-2014), N=500 households
  • Ethiopia: 3 waves (2011-2015), N=500 households

The data include household-level information on enterprise participation, asset holdings, credit access, and agricultural production. Price exposure is computed from household crop portfolios and regional price indices.

3.2 Data Processing and Derived Variables

Derived target variables include:

  • Enterprise indicator: Binary (0/1) for non-farm enterprise operation
  • Asset index: Standardized index of durable assets
  • Credit access: Binary indicator for formal credit access
  • Price exposure: Weighted average of crop-specific price changes

See docs/DATA_CONTRACT.md for the full schema specification and docs/DATA_AUDIT.md for provenance documentation.
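
As an illustration of the price exposure variable, the following minimal sketch computes a share-weighted average of crop-specific price changes. The function name `price_exposure`, the use of portfolio value shares as weights, and the example crops and numbers are assumptions made for illustration; the actual weighting scheme is defined in the data contract.

```python
def price_exposure(crop_shares, price_changes):
    """Share-weighted average of crop-specific relative price changes.

    crop_shares: dict crop -> non-negative portfolio weight (hypothetical input).
    price_changes: dict crop -> relative price change, e.g. -0.10 for a 10% drop.
    """
    total = sum(crop_shares.values())
    if total <= 0:
        # Assumption: a household with no crop portfolio has zero exposure.
        return 0.0
    return sum(w * price_changes[c] for c, w in crop_shares.items()) / total


# Hypothetical household: 60% maize, 40% coffee.
shares = {"maize": 0.6, "coffee": 0.4}
changes = {"maize": 0.05, "coffee": -0.20}
exposure = price_exposure(shares, changes)
print(round(exposure, 3))
```

In this example the coffee price decline dominates, so the household's exposure is negative even though the maize price rose.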

3.3 Calibration Artifact

Distribution parameters are fitted from the LSMS data and stored in a calibration artifact (artifacts/calibration/tanzania/calibration.json). Key fitted distributions:

Table 1: Calibration parameters fitted from LSMS-ISA Tanzania data. Both the asset and price-shock distributions use the normal family. The K-S goodness-of-fit tests reject the normal fits, which is attributed to heavy-tailed asset data (see Limitations).
Variable Distribution Mean/Rate SD/Coef K-S p-value
Assets (raw) Normal 36727 57935 2.81e-131
Price Exposure Normal 0.056 0.132 3.22e-88
Credit Rate Logistic model 25.7% 3.56e-07
Enterprise Prevalence Empirical 25.7%

Note on K-S Tests: The Kolmogorov-Smirnov tests reject the null hypothesis of distributional fit. This is expected for heavy-tailed economic data with large sample sizes (Clauset, Shalizi, and Newman 2009). Visual inspection (QQ-plots in Appendix) and moment-matching suggest the normal approximation captures the central tendency adequately for our simulation purposes, though future work should explore alternative distributional families.
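
To make the K-S discussion concrete, the sketch below fits a normal distribution by moments and computes the K-S distance between the empirical and fitted CDFs. It uses only the Python standard library and synthetic data: a lognormal sample stands in for heavy-tailed assets, and the function name `ks_statistic_normal` is hypothetical. It is not the calibration code used for Table 1. Note also that when parameters are estimated from the same sample, standard K-S p-values are only approximate, so the sketch reports the distance alone.

```python
import random
from statistics import NormalDist, fmean, stdev


def ks_statistic_normal(sample):
    """Fit a normal by moments and return (mean, sd, K-S distance D)."""
    mu, sigma = fmean(sample), stdev(sample)
    fitted = NormalDist(mu, sigma)
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return mu, sigma, d


rng = random.Random(0)
# Illustrative data only: a lognormal sample stands in for heavy-tailed assets.
heavy = [rng.lognormvariate(10.0, 1.0) for _ in range(2000)]
light = [rng.gauss(0.0, 1.0) for _ in range(2000)]

_, _, d_heavy = ks_statistic_normal(heavy)
_, _, d_light = ks_statistic_normal(light)
print(f"D (lognormal sample) = {d_heavy:.3f}, D (normal sample) = {d_light:.3f}")
```

The distance for the heavy-tailed sample is much larger than for the genuinely normal sample, which is the pattern behind the rejections reported above.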

4 Model

4.1 Overview (ODD Protocol)

The model follows the ODD+D protocol (Grimm et al. 2020). This section provides a summary; the full ODD description is in Appendix A.

4.1.1 Purpose

The model investigates the relationship between agricultural price shocks and household enterprise entry, with heterogeneous responses by asset holdings and credit access. Households are classified as “stayers” (persistent entrepreneurs who operate an enterprise in more than 50% of waves) or “copers” (intermittent responders).
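
The classification can be sketched as a function of a household's per-wave enterprise indicators. The function name `classify` is hypothetical, and the boundary between “coper” and “none” (any participation at or below 50% of waves versus no participation at all) is an assumption inferred from the classification labels in Table 2 and the caption of Figure 3.

```python
def classify(statuses):
    """Classify a household from its per-wave enterprise indicators (0/1)."""
    share = sum(statuses) / len(statuses)
    if share > 0.5:
        return "stayer"  # enterprise in more than 50% of waves
    if share > 0.0:
        return "coper"   # assumption: any participation at or below 50%
    return "none"        # never operates an enterprise


print(classify([1, 1, 1, 0]))
print(classify([0, 1, 0, 1]))
print(classify([0, 0, 0, 0]))
```

Under this reading, a household with an enterprise in exactly half of the waves is a coper, because the stayer threshold is strict.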

4.1.2 Entities, State Variables, and Scales

Primary Entity: HouseholdAgent

Table 2: Household agent state variables.
| Variable | Type | Description |
|---|---|---|
| household_id | string | Unique identifier |
| wave | int | Survey wave (1-4) |
| assets | float | Standardized asset index |
| credit_access | int | Binary credit access |
| enterprise_status | int | Binary enterprise status |
| price_exposure | float | Price shock exposure |
| classification | string | stayer/coper/none |

Temporal Scale: Discrete time steps corresponding to survey waves (~2-year intervals).

Spatial Scale: No explicit spatial structure; agents respond independently to shared price distributions.

4.1.3 Process Overview

Each simulation step:

  1. Environment update: Price shock distribution for current wave
  2. Agent activation: All agents in random order
  3. Decision: Each agent queries policy for action
  4. State update: Agent applies action (ENTER/EXIT/NO_CHANGE)
  5. Data collection: Outcomes recorded
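
The five steps above can be sketched in plain Python without the Mesa framework. Everything in this block is illustrative: the `Household` dataclass, the `step` and `run` functions, the per-household shock draw, and the placeholder `toy_policy` are assumptions, not the model's code. The shock mean and standard deviation are the Price Exposure values from Table 1.

```python
import random
from dataclasses import dataclass


@dataclass
class Household:
    household_id: str
    assets: float
    enterprise_status: int
    price_exposure: float = 0.0


def step(households, rng, shock_mean, shock_sd, policy, log, wave):
    # 1. Environment update: draw this wave's price exposure for each household
    #    (assumption: independent draws from one shared distribution).
    for h in households:
        h.price_exposure = rng.gauss(shock_mean, shock_sd)
    # 2. Agent activation in random order.
    order = list(households)
    rng.shuffle(order)
    for h in order:
        # 3. Decision: the policy returns ENTER, EXIT, or NO_CHANGE.
        action = policy(h)
        # 4. State update.
        if action == "ENTER":
            h.enterprise_status = 1
        elif action == "EXIT":
            h.enterprise_status = 0
        # 5. Data collection.
        log.append((wave, h.household_id, action, h.enterprise_status))


def toy_policy(h):
    # Placeholder policy for illustration only.
    if h.enterprise_status == 0 and h.price_exposure < -0.1:
        return "ENTER"
    return "NO_CHANGE"


def run(seed):
    rng = random.Random(seed)  # one seeded RNG shared by all stochastic steps
    households = [Household(f"hh{i}", rng.gauss(0, 1), 0) for i in range(50)]
    log = []
    for wave in range(1, 5):
        step(households, rng, 0.056, 0.132, toy_policy, log, wave)
    return log


print(len(run(42)))
```

Because every random draw comes from a single generator created from the recorded seed, two runs with the same seed produce identical logs.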

4.2 Design Concepts

4.2.1 Heterogeneity Without Interaction

Agents differ in assets, credit access, and initial enterprise status. These differences produce heterogeneous responses to common exogenous shocks. However, agents do not directly interact; there are no network effects, peer influences, or market feedback loops in the current implementation.

This design choice reflects our focus on household-level coping decisions rather than social contagion or general equilibrium effects. Aggregate patterns (enterprise prevalence, classification distributions) arise from the aggregation of heterogeneous individual responses (Epstein 2008).

4.2.2 Stochasticity

Sources of stochasticity include:

  • Price shock draws from calibrated distributions
  • Asset initialization (synthetic mode)
  • Agent activation order (Mesa RandomActivation)

Centralized RNG with recorded seeds ensures reproducibility.

4.3 Decision Policies

4.3.1 Rule-Based Policy (Executed)

The baseline RulePolicy implements deterministic threshold-based decisions:

  • Enter: price_exposure < price_threshold AND assets > asset_threshold AND NOT currently in enterprise
  • Exit: assets < exit_threshold AND currently in enterprise
  • No Change: Otherwise
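
The three rules above can be written as a single pure function. This is a minimal sketch rather than the RulePolicy implementation: the function name `rule_policy`, the `Action` enum, and the threshold values in the usage lines are hypothetical, since the default thresholds are not stated here.

```python
from enum import Enum


class Action(Enum):
    ENTER = "ENTER"
    EXIT = "EXIT"
    NO_CHANGE = "NO_CHANGE"


def rule_policy(price_exposure, assets, in_enterprise,
                price_threshold, asset_threshold, exit_threshold):
    """Deterministic threshold rule for one household in one wave."""
    # Enter: sufficiently negative shock, sufficient assets, not yet operating.
    if (not in_enterprise and price_exposure < price_threshold
            and assets > asset_threshold):
        return Action.ENTER
    # Exit: assets have fallen below the exit threshold while operating.
    if in_enterprise and assets < exit_threshold:
        return Action.EXIT
    return Action.NO_CHANGE


# Hypothetical thresholds for illustration only.
thresholds = dict(price_threshold=-0.1, asset_threshold=0.0, exit_threshold=-1.0)
print(rule_policy(-0.2, 0.5, False, **thresholds))
print(rule_policy(-0.2, -1.5, True, **thresholds))
print(rule_policy(0.05, 0.5, False, **thresholds))
```

Because the function is deterministic, all run-to-run variation in the rule-based results comes from the shock draws, asset initialization, and activation order.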

4.3.2 LLM-Augmented Policy (Design Only)

We have designed a MultiSampleLLMPolicy architecture for future evaluation. Key features:

  • K samples at temperature T (default: K=5, T=0.6)
  • Constraint validation for feasibility
  • Majority vote aggregation with conservative tie-break
  • State-based caching for reproducibility

This policy has not been executed. All empirical results in this paper use the rule-based policy. See Discussion for planned LLM evaluation.
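
The aggregation step can be illustrated without calling any language model. In the sketch below the K samples are plain strings standing in for model outputs. The feasibility table, the function name `aggregate`, and the reading of “conservative tie-break” as falling back to NO_CHANGE are assumptions; this is not the MultiSampleLLMPolicy code.

```python
from collections import Counter

# Assumed feasibility constraints by current enterprise status
# (0 = not in enterprise, 1 = in enterprise).
FEASIBLE = {
    0: {"ENTER", "NO_CHANGE"},
    1: {"EXIT", "NO_CHANGE"},
}


def aggregate(samples, enterprise_status):
    """Majority vote over feasible samples; ties fall back to NO_CHANGE."""
    valid = [a for a in samples if a in FEASIBLE[enterprise_status]]
    if not valid:
        return "NO_CHANGE"
    counts = Counter(valid).most_common()
    top_action, top_count = counts[0]
    if len(counts) > 1 and counts[1][1] == top_count:
        return "NO_CHANGE"  # assumed meaning of "conservative tie-break"
    return top_action


# K=5 stand-in samples for a household not currently in enterprise.
print(aggregate(["ENTER", "ENTER", "NO_CHANGE", "ENTER", "EXIT"], 0))
print(aggregate(["ENTER", "NO_CHANGE", "ENTER", "NO_CHANGE", "EXIT"], 0))
```

In the second call the infeasible EXIT sample is discarded, the remaining votes tie at two each, and the household keeps its current status.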

4.4 Architecture

flowchart TB
    subgraph DataPipeline["Data Pipeline"]
        LSMS["LSMS-ISA Data"]
        ETL["ETL Module"]
        Calibrate["Calibration"]
        CalJSON["calibration.json"]
    end

    subgraph ABMCore["ABM Core (Mesa 3)"]
        SynthGen["Synthetic Panel Generator"]
        Model["EnterpriseCopingModel"]
        Agents["HouseholdAgents"]
        Policy["RulePolicy"]
    end

    subgraph Outputs["Outputs"]
        Outcomes["household_outcomes.parquet"]
        Manifest["manifest.json"]
    end

    LSMS --> ETL
    ETL --> Calibrate
    Calibrate --> CalJSON
    CalJSON --> SynthGen
    SynthGen --> Model
    Model --> Agents
    Agents --> Policy
    Policy --> Outcomes
    Model --> Manifest
Figure 1: System architecture showing data flow from LSMS ingestion through calibration and simulation to the output artifacts.

5 Experimental Design

5.1 Baseline Scenario

The baseline scenario uses LSMS-derived household data with the RulePolicy:

  • Country: Tanzania
  • N: 500 households
  • Waves: 4 (matching LSMS structure)
  • Seed: 42
  • Policy: RulePolicy with default thresholds

5.2 Parameter Sweeps

We conducted parameter sweeps over calibrated synthetic households to examine sensitivity:

  • Grid: 6×6 (price_threshold × asset_threshold)
  • Price threshold range: [-0.3, 0.0]
  • Asset threshold range: [-1.0, 1.0]
  • Seeds per cell: 2
  • N per run: 100 households
  • Total runs: 72

Data source: Calibrated synthetic (SyntheticPanelGenerator with CalibrationArtifact)
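
The run count follows from the grid: 6 price thresholds × 6 asset thresholds × 2 seeds = 72 runs. The sketch below enumerates such a design. Evenly spaced grid points, the seed values, and the helper `linspace` are assumptions; only the ranges, grid size, seeds per cell, and N per run are taken from the list above.

```python
from itertools import product


def linspace(lo, hi, n):
    """n evenly spaced values from lo to hi inclusive."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


price_thresholds = linspace(-0.3, 0.0, 6)
asset_thresholds = linspace(-1.0, 1.0, 6)
seeds = [0, 1]  # hypothetical seed values; only the count (2) is specified

runs = [
    {"price_threshold": p, "asset_threshold": a, "seed": s, "n_households": 100}
    for p, a, s in product(price_thresholds, asset_thresholds, seeds)
]
print(len(runs))  # 6 * 6 * 2 = 72
```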

5.3 Behavior Exploration

We conducted random search over parameter space:

  • Candidates: 40
  • Seeds per candidate: 2
  • Objective: MSE between simulated and LSMS-derived enterprise rates

Target enterprise rates (LSMS-derived):

  • Wave 1: 19.6%
  • Wave 2: 26.0%
  • Wave 3: 28.6%
  • Wave 4: 28.4%
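
A minimal sketch of the objective and search loop follows. The target rates are the four values listed above. The sampling ranges, the averaging of the objective over seeds, and the stand-in simulator `toy_simulate` are assumptions for illustration and do not reproduce the search reported in Table 4.

```python
import random

TARGET_RATES = [0.196, 0.260, 0.286, 0.284]  # LSMS-derived, waves 1-4


def mse(simulated, target=TARGET_RATES):
    """Mean squared error between simulated and target wave-level rates."""
    return sum((s - t) ** 2 for s, t in zip(simulated, target)) / len(target)


def random_search(simulate, n_candidates=40, n_seeds=2, seed=0):
    """Return (best_objective, best_params) over randomly drawn thresholds."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_candidates):
        params = {
            # Sampling ranges are placeholders, not the ranges used in the paper.
            "price_threshold": rng.uniform(-0.3, 0.1),
            "asset_threshold": rng.uniform(-1.0, 1.5),
            "exit_threshold": rng.uniform(-2.0, 0.0),
        }
        # Assumption: the objective is averaged over seeds for each candidate.
        score = sum(mse(simulate(params, s)) for s in range(n_seeds)) / n_seeds
        if best is None or score < best[0]:
            best = (score, params)
    return best


def toy_simulate(params, seed):
    # Stand-in for a model run: returns four made-up wave-level rates.
    base = 0.25 + 0.1 * params["price_threshold"]
    return [base] * 4


best_score, best_params = random_search(toy_simulate)
print(f"best MSE = {best_score:.5f}")
```

A flat simulated trajectory cannot match the rising target rates exactly, so the best objective in this toy setting stays above zero.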

6 Results

6.1 Pattern Validation (LSMS-Derived Baseline)

6.1.1 Enterprise Prevalence

Figure 2: Enterprise rates by wave from LSMS-derived baseline simulation (Tanzania, N=500, seed=42). The model reproduces the increasing trend observed in LSMS data. Data source: outputs/tanzania/baseline/

6.1.2 Household Classification

Figure 3: Distribution of household classifications based on enterprise persistence. Stayers (enterprise in more than 50% of waves) are persistent entrepreneurs; copers participate intermittently; households classified as none never operate an enterprise. Data source: outputs/tanzania/baseline/

6.1.3 Path Dependence

Table 3: Transition matrix showing enterprise status persistence. The diagonal values of 100% indicate complete persistence, an extreme form of path dependence: no household changes enterprise status between consecutive waves in this run. Data source: outputs/tanzania/baseline/
Transition Probabilities (Current → Next Wave)

| Current status | Next wave: In Enterprise | Next wave: Not in Enterprise |
|---|---|---|
| In Enterprise | 100.0% | 0.0% |
| Not in Enterprise | 0.0% | 100.0% |
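
Transition shares of this kind can be computed from per-household status sequences by counting consecutive-wave pairs and normalizing each row. The sketch below uses a hypothetical three-household panel and a hypothetical function name `transition_matrix`; it is not the analysis code behind Table 3.

```python
from collections import Counter


def transition_matrix(panel):
    """Row-normalized transition shares from per-household status sequences.

    panel: dict household_id -> list of 0/1 enterprise status by wave.
    Returns {current: {next: share}}; rows with no observations are omitted.
    """
    counts = Counter()
    for statuses in panel.values():
        for cur, nxt in zip(statuses, statuses[1:]):
            counts[(cur, nxt)] += 1
    matrix = {}
    for cur in (1, 0):
        row_total = counts[(cur, 1)] + counts[(cur, 0)]
        if row_total:
            matrix[cur] = {nxt: counts[(cur, nxt)] / row_total for nxt in (1, 0)}
    return matrix


# Hypothetical panel in which no household ever switches status.
panel = {"a": [1, 1, 1, 1], "b": [0, 0, 0, 0], "c": [0, 0, 0, 0]}
print(transition_matrix(panel))
```

A panel with no switching yields a matrix with ones on the diagonal, the same structure as the table above.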

6.2 Parameter Sensitivity (Calibrated Synthetic)

6.2.1 Policy Threshold Heatmap

Figure 4: Enterprise rate sensitivity to policy thresholds. Heatmap shows mean enterprise rate across 6×6 parameter grid (2 seeds per cell). Lower price thresholds require stronger negative shocks to trigger entry. Data source: outputs/sweeps/calibrated/sweep_agg_latest.parquet

6.2.2 Behavior Exploration Results

Table 4: Top 5 parameter configurations from behavior exploration (40 candidates, 2 seeds each). Objective is MSE between simulated and LSMS enterprise rates. Data source: outputs/search/calibrated/candidates_latest.parquet
| ID | Price Thresh | Asset Thresh | Exit Thresh | Objective (MSE) |
|---|---|---|---|---|
| 6 | 0.014 | 0.395 | -0.484 | 0.0017 |
| 24 | 0.027 | -0.798 | -1.883 | 0.0017 |
| 15 | 0.002 | -0.338 | -1.423 | 0.0022 |
| 33 | 0.081 | 1.226 | -0.601 | 0.0023 |
| 8 | -0.011 | -0.916 | -1.067 | 0.0028 |

7 Robustness and Diagnostics

7.1 Multi-Seed Variability

Figure 5: Enterprise rate variability across 10 replicate runs with different random seeds (LSMS-derived baseline, N=500). The narrow distribution indicates stable outcomes. Data source: outputs/batch/lsms/

Interpretation: With 10 seeds, the coefficient of variation (CV) for enterprise rate is less than 5%, indicating stable aggregate outcomes. However, 10 seeds is insufficient for robust inference; publication-quality analysis should use 30-100+ replications (Bankes 1993).
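
For reference, the coefficient of variation is the sample standard deviation divided by the mean. The sketch below computes it for ten replicate enterprise rates. The rates are hypothetical values chosen for illustration and are not the results shown in Figure 5.

```python
from statistics import fmean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / fmean(values)


# Hypothetical enterprise rates from 10 replicate seeds (not the paper's results).
rates = [0.262, 0.258, 0.270, 0.255, 0.266, 0.260, 0.249, 0.272, 0.263, 0.257]
cv = coefficient_of_variation(rates)
print(f"CV = {cv:.1%}")
```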

7.2 Regression Analysis

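Table 5: Fixed effects regression of enterprise status on price exposure. The hypothesis that negative price shocks are associated with enterprise entry predicts a negative coefficient. The estimated coefficient is 0.0000 with undefined (NaN) standard error, t-statistic, and p-value, so the expected sign is not matched and this regression provides no support for the hypothesis in the baseline run. Data source: outputs/tanzania/baseline/

| Term | Coefficient | Std. Error | t-statistic | p-value | Sign Match |
|---|---|---|---|---|---|
| price_exposure | 0.0000 | NaN | NaN | NaN | No |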

8 Limitations and Future Work

8.1 Current Limitations

8.1.1 Model Architecture

  1. No Agent Interactions: Households respond independently to exogenous shocks. Social network effects, peer influence, and local market feedback are not modeled.

  2. Exogenous Price Shocks: Prices are drawn from calibrated distributions without feedback from enterprise activity. No general equilibrium effects.

  3. Binary Outcomes: Enterprise status is 0/1; enterprise type, size, and profitability are not modeled.

8.1.2 Calibration Limitations

  1. Heavy-Tailed Distributions: K-S tests reject normality for asset distributions. Future work should explore lognormal, Pareto, or generalized extreme value distributions (Clauset, Shalizi, and Newman 2009).

  2. Copula Dependence: The Gaussian copula captures linear dependence but may miss tail dependence important for extreme events.

8.1.3 Computational Limitations

  1. Exploratory Scale: Current analysis uses 10 seeds for robustness and 2 seeds per sweep cell. Publication-quality analysis requires 30-100+ replications.

  2. Behavior Search: 40 candidates with 2 seeds each is exploratory, not optimization. Results identify promising parameter regions but should not be interpreted as optimal configurations.

8.1.4 LLM Policy Status

The LLM-augmented policy has not been executed. All results use the rule-based baseline. LLM policy sections describe the implemented architecture and planned evaluation approach, not empirical findings.

8.2 Future Work

  1. LLM Policy Evaluation: Execute the MultiSampleLLMPolicy and compare to rule-based baseline and ML benchmarks.

  2. Cross-Country Validation: Test model calibrated on Tanzania against Ethiopia data.

  3. Extended Replications: Increase seed count to 50-100 for robust inference.

  4. Alternative Distributions: Fit heavy-tailed distributions (lognormal, Pareto) for assets.

  5. Agent Interactions: Introduce network effects and local market dynamics.

9 Conclusion

This paper presents an agent-based model of household enterprise entry as a coping mechanism under agricultural price shocks, calibrated to LSMS-ISA panel data from Tanzania and Ethiopia. The model successfully reproduces key empirical patterns including enterprise prevalence trends, household classification distributions, and path-dependent trajectories.

Our contribution is methodological: we demonstrate a generative microsimulation approach where distributional parameters are fitted from empirical data, enabling systematic exploration of parameter sensitivity through calibrated synthetic panels. The approach maintains clear provenance and reproducibility through manifest tracking and centralized random number generation.

We emphasize that current results are exploratory. The model lacks agent interactions that would be required for claims about complex adaptive dynamics. The LLM-augmented decision policy, while fully implemented, has not been executed due to computational constraints. Future work will address these limitations through expanded replication, alternative distributional families, and LLM policy evaluation.

The model provides a foundation for policy analysis of interventions targeting household enterprise coping strategies, including credit access expansion, price stabilization programs, and asset transfer schemes. Such analysis requires careful attention to the model’s epistemic boundaries and the distinction between model outputs and real-world predictions.

10 References

Bankes, Steven. 1993. “Exploratory Modeling for Policy Analysis.” Operations Research 41 (3): 435–49.
Clauset, Aaron, Cosma Rohilla Shalizi, and Mark EJ Newman. 2009. “Power-Law Distributions in Empirical Data.” SIAM Review 51 (4): 661–703.
Dercon, Stefan. 2002. “Income Risk, Coping Strategies, and Safety Nets.” The World Bank Research Observer 17 (2): 141–66.
Epstein, Joshua M. 2008. “Why Model?” Journal of Artificial Societies and Social Simulation 11 (4): 12.
Gilbert, Nigel. 2008. Agent-Based Models. Quantitative Applications in the Social Sciences. SAGE Publications.
Grimm, Volker, Uta Berger, Finn Bastiansen, Sigrunn Eliassen, Vincent Ginot, Jarl Giske, John Goss-Custard, et al. 2006. “A Standard Protocol for Describing Individual-Based and Agent-Based Models.” Ecological Modelling 198 (1-2): 115–26.
Grimm, Volker, Uta Berger, Donald L DeAngelis, J Gary Polhill, Jarl Giske, and Steven F Railsback. 2010. “The ODD Protocol: A Review and First Update.” Ecological Modelling 221 (23): 2760–68.
Grimm, Volker, Steven F Railsback, Christian E Vincenot, Uta Berger, Cara Gallagher, Donald L DeAngelis, Bruce Edmonds, et al. 2020. “The ODD Protocol for Describing Agent-Based and Other Simulation Models: A Second Update to Improve Clarity, Replication, and Structural Realism.” Journal of Artificial Societies and Social Simulation 23 (2).
Macal, Charles M, and Michael J North. 2010. “Tutorial on Agent-Based Modeling and Simulation.” Journal of Simulation 4 (3): 151–62.
Railsback, Steven F, and Volker Grimm. 2019. Agent-Based and Individual-Based Modeling: A Practical Introduction. 2nd ed. Princeton University Press.
World Bank. 2024. “Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA).” https://www.worldbank.org/en/programs/lsms.

11 Appendix A: Full ODD Description

See docs/abm_report.qmd for the complete ODD+D protocol description, including:

  • Detailed purpose and scope
  • Complete entity state variables
  • Process scheduling pseudocode
  • Design concepts (adaptation, learning, sensing, interaction, stochasticity)
  • Submodel specifications
  • Code references with line numbers

12 Appendix B: Data Provenance

12.1 Data Source Classification

| Classification | Code | Description |
|---|---|---|
| LSMS-derived | lsms | Uses load_derived_targets() from processed LSMS |
| Calibrated synthetic | calibrated | Uses SyntheticPanelGenerator with CalibrationArtifact |

12.2 Figure/Table Data Sources

| Figure/Table | Data Source | Path |
|---|---|---|
| Figure 2 | LSMS-derived | outputs/tanzania/baseline/ |
| Figure 3 | LSMS-derived | outputs/tanzania/baseline/ |
| Table 3 | LSMS-derived | outputs/tanzania/baseline/ |
| Figure 4 | Calibrated synthetic | outputs/sweeps/calibrated/ |
| Table 4 | Calibrated synthetic | outputs/search/calibrated/ |
| Figure 5 | LSMS-derived | outputs/batch/lsms/ |

12.3 Calibration Artifact

  • Path: artifacts/calibration/tanzania/calibration.json
  • Git commit: See artifact git_commit field
  • Created: See artifact created_at field

13 Appendix C: Reproduction Commands

See docs/REPRODUCIBILITY.md for complete environment setup and command reference.

Quick Start:

# Setup
make setup              # Python dependencies
make setup-r            # R dependencies

# Calibration
abm calibrate --country tanzania --data-dir data/processed

# Baseline simulation
make run-sim COUNTRY=tanzania

# Parameter sweep (calibrated)
python3 scripts/run_sweep.py --calibration artifacts/calibration/tanzania/calibration.json

# Behavior search (calibrated)
python3 scripts/run_behavior_search.py --calibration artifacts/calibration/tanzania/calibration.json --targets-from-lsms

# Render this document
quarto render docs/paper.qmd --to html

Document generated: 2026-01-14
Repository: abm-enterprise-coping